Abstract

In this paper, we analyze the impact of compressed sensing with complex random matrices on Fisher 
information and the Cramer-Rao Bound (CRB) for estimating unknown parameters in the mean value 
function of a complex multivariate normal distribution. We consider the class of random compression 
matrices whose distribution is right-orthogonally invariant. The compression matrix whose elements are 
i.i.d. standard normal random variables is one such matrix. We show that for all such compression 
matrices, the Fisher information matrix has a complex matrix beta distribution. We also derive the 
distribution of the CRB. These distributions can be used to quantify the loss in CRB as a function of the Fisher 
information of the non-compressed data. In our numerical examples, we consider a direction of arrival 
estimation problem and discuss the use of these distributions as guidelines for choosing compression 
ratios based on the resulting loss in CRB. 
I. Introduction 

Inversion of a measurement for its underlying modes is an important topic which has applications in 
communications, radar/sonar signal processing and optical imaging. Classical methods for inversion are 
based on maximum likelihood, variations on linear prediction, subspace filtering, etc. Compressed sensing 
ffl-lEl is a relatively new theory which exploits sparse representations and sparse recovery for inversion. 

In our previous work iSl-Q, the sensitivity of sparse inversion algorithms to basis mismatch and frame 
mismatch was studied. Our results show that mismatch between the actual basis in which a signal has 
a sparse representation and the basis (or frame) which is used for sparsity in a sparse reconstruction 
algorithm, e.g., basis pursuit, has performance consequences on the estimated parameter vector. 

This paper addresses another fundamental question: How much information is retained (or lost) in 
compressed noisy measurements for nonlinear parameter estimation? To answer this question, we analyze 
the effect of compressed sensing on the Fisher information matrix and the Cramer-Rao bound (CRB). 
We derive the distribution of the Fisher information matrix and the CRB for the class of random matrices 
whose distributions are invariant under right-orthogonal transformations. These include i.i.d. draws of 
spherically invariant matrix rows, including, for example, i.i.d. draws of standard normal matrix elements. 

Our prior work on compressed sensing and the Fisher information matrix O, Q contains numerical 
results that characterize the increase in CRB after random compression for the case where the parameters 
nonlinearly modulate the mean in a multivariate normal measurement model. Also, in lUl we derived 
analytical lower and upper bounds on the CRB for the same problem, and used our bounds to quantify 
the potential loss in CRB. 

Other studies on the effect of compressed sensing on the CRB and the Fisher information matrix include 
ll9l- llTn . Babadi et al. Q proposed a “Joint Typicality Estimator” to show the existence of an estimator 
that asymptotically achieves the CRB of sparse parameter estimation for random Gaussian compression 
matrices. Niazadeh et al. lITOl generalize the results of @ to a class of random compression matrices 
which satisfy the concentration of measures inequality. Nielsen et al. ifT^ derive the mean of the Fisher 
information matrix for the same class of random compression matrices that we are considering. Ramasamy 
et al. lITTll derive bounds on the Fisher information matrix, but not for the model we are considering. We 
will clarify the distinction between our work and iHTl after establishing our notation in Section II. 

II. Problem statement 

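Let $\mathbf{y} \in \mathbb{C}^n$ be a complex random vector whose probability density function $f(\mathbf{y}; \boldsymbol{\theta})$ is parameterized 
by an unknown but deterministic parameter vector $\boldsymbol{\theta} \in \mathbb{R}^p$. The derivative of the log-likelihood function 
with respect to $\boldsymbol{\theta} = [\theta_1, \theta_2, \ldots, \theta_p]^T$ is called the Fisher score, and the covariance matrix of the Fisher 
score is the Fisher information matrix, which we denote by $\mathbf{J}(\boldsymbol{\theta})$: 

\[
\mathbf{J}(\boldsymbol{\theta}) = E\left[ \left( \frac{\partial \log f(\mathbf{y}; \boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \right) \left( \frac{\partial \log f(\mathbf{y}; \boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \right)^H \right]. \tag{1}
\]

The inverse of the Fisher information matrix lower bounds the error covariance matrix for any 
unbiased estimator $\hat{\boldsymbol{\theta}}(\mathbf{y})$ of $\boldsymbol{\theta}$, that is, 

\[
E\left[ (\hat{\boldsymbol{\theta}}(\mathbf{y}) - \boldsymbol{\theta}) (\hat{\boldsymbol{\theta}}(\mathbf{y}) - \boldsymbol{\theta})^H \right] \succeq \mathbf{J}^{-1}(\boldsymbol{\theta}), \tag{2}
\]

where $\mathbf{A} \succeq \mathbf{B}$ for matrices $\mathbf{A}, \mathbf{B} \in \mathbb{C}^{n \times n}$ means $\mathbf{a}^H \mathbf{A} \mathbf{a} \geq \mathbf{a}^H \mathbf{B} \mathbf{a}$ for all $\mathbf{a} \in \mathbb{C}^n$. The $i$th diagonal 
element of $\mathbf{J}^{-1}(\boldsymbol{\theta})$ is the CRB for estimating $\theta_i$ and it gives a lower bound on the MSE of any unbiased 
estimator of $\theta_i$ from $\mathbf{y}$ (see, e.g., uni). 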

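For $\mathbf{y} \in \mathbb{C}^n$ a proper complex normal random vector distributed as $\mathcal{CN}(\mathbf{x}(\boldsymbol{\theta}), \mathbf{C})$ with unknown mean 
vector $\mathbf{x}(\boldsymbol{\theta})$ parameterized by $\boldsymbol{\theta}$, and known covariance $\mathbf{C} = \sigma^2 \mathbf{I}$, the Fisher information matrix is the 
Grammian 

\[
\mathbf{J}(\boldsymbol{\theta}) = \mathbf{G}^H \mathbf{C}^{-1} \mathbf{G} = \frac{1}{\sigma^2} \mathbf{G}^H \mathbf{G}. \tag{3}
\]

The $i$th column $\mathbf{g}_i$ of $\mathbf{G} = [\mathbf{g}_1, \mathbf{g}_2, \ldots, \mathbf{g}_p]$ is the partial derivative $\mathbf{g}_i = \frac{\partial}{\partial \theta_i} \mathbf{x}(\boldsymbol{\theta})$, which characterizes 
the sensitivity of the mean vector $\mathbf{x}(\boldsymbol{\theta})$ to variation of the $i$th parameter $\theta_i$. 

The CRB for estimating $\theta_i$ is given by 

\[
(\mathbf{J}^{-1}(\boldsymbol{\theta}))_{ii} = \sigma^2 \left( \mathbf{g}_i^H (\mathbf{I} - \mathbf{P}_{\mathbf{G}_i}) \mathbf{g}_i \right)^{-1}, \tag{4}
\]

where $\mathbf{G}_i$ consists of all columns of $\mathbf{G}$ except $\mathbf{g}_i$, and $\mathbf{P}_{\mathbf{G}_i}$ is the orthogonal projection onto the column 
space of $\mathbf{G}_i$ |[T4l . This CRB can also be written as 

\[
(\mathbf{J}^{-1}(\boldsymbol{\theta}))_{ii} = \frac{\sigma^2}{\|\mathbf{g}_i\|_2^2 \sin^2(\psi_i)}, \tag{5}
\]

where $\psi_i$ is the principal angle between subspaces $\langle \mathbf{g}_i \rangle$ and $\langle \mathbf{G}_i \rangle$. These representations illuminate the 
geometry of the CRB, which is discussed in detail in llTdll . 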
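As a quick numerical illustration of the geometry, the following NumPy sketch evaluates the CRB for one parameter in three ways: from the inverse of the Grammian, from the projection form (4), and from the principal-angle form (5). The sensitivity matrix `G` is random stand-in data and the function names are illustrative assumptions, not part of the paper.

```python
import numpy as np

def crb_from_inverse(G, i, sigma2=1.0):
    """i-th diagonal element of J^{-1}, with J = G^H G / sigma^2."""
    J = G.conj().T @ G / sigma2
    return float(np.real(np.linalg.inv(J)[i, i]))

def crb_from_projection(G, i, sigma2=1.0):
    """Projection form: sigma^2 / (g_i^H (I - P_{G_i}) g_i)."""
    g = G[:, i]
    G_rest = np.delete(G, i, axis=1)
    # The least-squares residual of g_i on the other columns is (I - P_{G_i}) g_i.
    coef, *_ = np.linalg.lstsq(G_rest, g, rcond=None)
    resid = g - G_rest @ coef
    return float(sigma2 / np.real(np.vdot(resid, resid)))

def crb_from_angle(G, i, sigma2=1.0):
    """Principal-angle form: sigma^2 / (||g_i||^2 sin^2(psi_i))."""
    g = G[:, i]
    Q, _ = np.linalg.qr(np.delete(G, i, axis=1))  # orthonormal basis of <G_i>
    cos2 = np.linalg.norm(Q.conj().T @ g) ** 2 / np.linalg.norm(g) ** 2
    return float(sigma2 / (np.linalg.norm(g) ** 2 * (1.0 - cos2)))

rng = np.random.default_rng(1)
n, p = 20, 3  # hypothetical sizes chosen for the illustration
G = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))
crb_values = [f(G, 0, sigma2=0.5)
              for f in (crb_from_inverse, crb_from_projection, crb_from_angle)]
print(crb_values)  # the three numbers should agree up to rounding
```

The three routes agree because $\mathbf{g}_i^H(\mathbf{I} - \mathbf{P}_{\mathbf{G}_i})\mathbf{g}_i$ is the squared length of the part of $\mathbf{g}_i$ orthogonal to the other columns, which is $\|\mathbf{g}_i\|_2^2 \sin^2(\psi_i)$.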


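If $\mathbf{y}$ is compressed by a compression matrix $\boldsymbol{\Phi} \in \mathbb{C}^{m \times n}$ to produce $\hat{\mathbf{y}} = \boldsymbol{\Phi} \mathbf{y}$, then the probability density 
function of the compressed data $\hat{\mathbf{y}}$ is $\mathcal{CN}(\boldsymbol{\Phi} \mathbf{x}(\boldsymbol{\theta}), \sigma^2 \boldsymbol{\Phi} \boldsymbol{\Phi}^H)$. The Fisher information matrix $\hat{\mathbf{J}}(\boldsymbol{\theta})$ is given 
by 

\[
\hat{\mathbf{J}}(\boldsymbol{\theta}) = \frac{1}{\sigma^2} \hat{\mathbf{G}}^H \hat{\mathbf{G}}, \tag{6}
\]

where $\hat{\mathbf{G}} = \mathbf{P}_{\boldsymbol{\Phi}^H} \mathbf{G}$. The CRB for estimating the $i$th parameter is 

\[
(\hat{\mathbf{J}}^{-1}(\boldsymbol{\theta}))_{ii} = \sigma^2 \left( \hat{\mathbf{g}}_i^H (\mathbf{I} - \mathbf{P}_{\hat{\mathbf{G}}_i}) \hat{\mathbf{g}}_i \right)^{-1}, \tag{7}
\]

where $\hat{\mathbf{g}}_i$ is the $i$th column of $\hat{\mathbf{G}}$, $\hat{\mathbf{G}}_i = \mathbf{P}_{\boldsymbol{\Phi}^H} \mathbf{G}_i$, and $\mathbf{P}_{\boldsymbol{\Phi}^H} = \boldsymbol{\Phi}^H (\boldsymbol{\Phi} \boldsymbol{\Phi}^H)^{-1} \boldsymbol{\Phi}$ is the orthogonal projection onto the row span of $\boldsymbol{\Phi}$. 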
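To make the compressed model concrete, here is a minimal NumPy sketch that forms the Fisher information matrix before and after compression and compares the resulting CRBs. The random `G` and `Phi`, the sizes, and the helper names are assumptions chosen only for illustration.

```python
import numpy as np

def row_space_projector(Phi):
    """P = Phi^H (Phi Phi^H)^{-1} Phi, the projection onto the row span of Phi."""
    return Phi.conj().T @ np.linalg.solve(Phi @ Phi.conj().T, Phi)

def fisher_information(G, sigma2=1.0, Phi=None):
    """G^H G / sigma^2 without compression; uses P G in place of G when Phi is given."""
    if Phi is not None:
        G = row_space_projector(Phi) @ G
    return G.conj().T @ G / sigma2

rng = np.random.default_rng(0)
n, m, p = 32, 16, 2  # hypothetical dimensions
G = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))
Phi = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)

J = fisher_information(G)
J_hat = fisher_information(G, Phi=Phi)
crb_before = np.real(np.diag(np.linalg.inv(J)))
crb_after = np.real(np.diag(np.linalg.inv(J_hat)))
print(crb_after / crb_before)  # each entry is at least 1
```

Because the difference between the two Fisher information matrices is $\frac{1}{\sigma^2}\mathbf{G}^H(\mathbf{I} - \mathbf{P}_{\boldsymbol{\Phi}^H})\mathbf{G}$, which is positive semidefinite, compression can only increase each CRB for a given realization of the compressor.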


Our aim is to study the effect of random compression on the Fisher information matrix and the CRB. 


In Section III we investigate this problem by deriving the distributions of the Fisher information matrix 
and the CRB for the case in which the elements of the compression matrix are distributed as i.i.d. 
standard normal random variables. Then we demonstrate that the same analysis holds for a wider range 
of random compression matrices. 


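Remark 1: In parallel to our work, Ramasamy et al. iHTl have also looked at the impact of compression on 
Fisher information. However, they have considered a different parameter model. Specifically, their compressed 
data has density $\mathcal{CN}(\boldsymbol{\Phi} \mathbf{x}(\boldsymbol{\theta}), \sigma^2 \mathbf{I})$, in contrast to ours which is distributed as $\mathcal{CN}(\boldsymbol{\Phi} \mathbf{x}(\boldsymbol{\theta}), \sigma^2 \boldsymbol{\Phi} \boldsymbol{\Phi}^H)$. 
Our model is a signal-plus-noise model, wherein the noisy signal $\mathbf{x}(\boldsymbol{\theta}) + \mathbf{n}$, $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$, is 
compressed to produce $\boldsymbol{\Phi} \mathbf{x}(\boldsymbol{\theta}) + \boldsymbol{\Phi} \mathbf{n}$. In contrast, their model corresponds to compressing a noiseless 
signal $\mathbf{x}(\boldsymbol{\theta})$ to produce $\boldsymbol{\Phi} \mathbf{x}(\boldsymbol{\theta}) + \mathbf{w}$, where $\mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$ represents post-compression noise. Note 
that the Fisher information, CRB and corresponding bounds of these two models are different, as in 
our model noise enters at the input of the compressor, whereas in ifTTl noise enters at the output of the 
compressor. This is an important distinction. 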
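One way to see the distinction numerically is to rescale the compressor. The sketch below, which assumes the standard Fisher information for a complex normal mean model with known covariance, shows that the Fisher information of the noise-before-compression model is unchanged when $\boldsymbol{\Phi}$ is scaled, whereas the noise-after-compression model scales quadratically. The function names and the random data are illustrative assumptions.

```python
import numpy as np

def fim_noise_before_compression(G, Phi, sigma2=1.0):
    """Data Phi (x + n): covariance sigma^2 Phi Phi^H."""
    PhiG = Phi @ G
    return PhiG.conj().T @ np.linalg.solve(Phi @ Phi.conj().T, PhiG) / sigma2

def fim_noise_after_compression(G, Phi, sigma2=1.0):
    """Data Phi x + w: covariance sigma^2 I."""
    PhiG = Phi @ G
    return PhiG.conj().T @ PhiG / sigma2

rng = np.random.default_rng(4)
n, m, p = 24, 8, 2  # hypothetical dimensions
G = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))
Phi = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

scale = 3.0
same_in_our_model = np.allclose(
    fim_noise_before_compression(G, scale * Phi),
    fim_noise_before_compression(G, Phi))
scaled_in_other_model = np.allclose(
    fim_noise_after_compression(G, scale * Phi),
    scale ** 2 * fim_noise_after_compression(G, Phi))
print(same_in_our_model, scaled_in_other_model)
```

In the first model only the row span of $\boldsymbol{\Phi}$ matters, which is why the analysis that follows can be phrased entirely in terms of the projection $\mathbf{P}_{\boldsymbol{\Phi}^H}$.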
III. Distribution of the Fisher information matrix after compression 


Let $\mathbf{W}$ be the normalized Fisher information matrix after compression, defined as 

\[
\mathbf{W} = \mathbf{J}^{-1/2} \hat{\mathbf{J}} \mathbf{J}^{-H/2} \in \mathbb{C}^{p \times p}, \tag{8}
\]

where $\mathbf{J}$ and $\hat{\mathbf{J}}$ are the Fisher information matrix before compression and the Fisher information matrix 
after compression, defined in (3) and (6), respectively. Our aim is to derive the distribution of the random 
matrix $\mathbf{W}$ for the case that the elements $\Phi_{ij}$ of the compression matrix are i.i.d. random variables 
distributed as $\mathcal{CN}(0, 1)$. We assume $n - p > m$, which is typical in almost all compression scenarios of 
interest. 


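Using (3) and (6), $\mathbf{W}$ may be written as 

\[
\mathbf{W} = (\mathbf{G}^H \mathbf{G})^{-1/2} \mathbf{G}^H \mathbf{P}_{\boldsymbol{\Phi}^H} \mathbf{G} (\mathbf{G}^H \mathbf{G})^{-H/2} = \mathbf{H}^H \mathbf{P}_{\boldsymbol{\Phi}^H} \mathbf{H}, \tag{9}
\]

where $\mathbf{H} = \mathbf{G} (\mathbf{G}^H \mathbf{G})^{-H/2}$ is a left orthogonal matrix, i.e., $\mathbf{H}^H \mathbf{H} = \mathbf{I}_p$. Define $\bar{\mathbf{H}} \in \mathbb{C}^{n \times (n-p)}$ such that 
$\mathbf{A} = [\mathbf{H} \,|\, \bar{\mathbf{H}}]$ is an orthonormal basis for $\mathbb{C}^n$, i.e., $\mathbf{A} \mathbf{A}^H = \mathbf{A}^H \mathbf{A} = \mathbf{I}_n$. Then we have: 

\[
\mathbf{W} = \begin{bmatrix} \mathbf{I}_p & \mathbf{0} \end{bmatrix} \mathbf{A}^H \mathbf{P}_{\boldsymbol{\Phi}^H} \mathbf{A} \begin{bmatrix} \mathbf{I}_p \\ \mathbf{0} \end{bmatrix}, \tag{10}
\]

where 

\[
\mathbf{A}^H \mathbf{P}_{\boldsymbol{\Phi}^H} \mathbf{A} = \mathbf{P}_{\mathbf{A}^H \boldsymbol{\Phi}^H}. \tag{11}
\]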

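Because the distribution of $\boldsymbol{\Phi}$ is right orthogonally invariant, the distribution of $\mathbf{A}^H \mathbf{P}_{\boldsymbol{\Phi}^H} \mathbf{A}$ is the same 
as the distribution of $\mathbf{P}_{\boldsymbol{\Phi}^H}$. Therefore, the distribution of $\mathbf{W}$ is the same as the distribution of $\mathbf{V} = \boldsymbol{\Phi}_1^H (\boldsymbol{\Phi} \boldsymbol{\Phi}^H)^{-1} \boldsymbol{\Phi}_1$, 
where $\boldsymbol{\Phi} = [\boldsymbol{\Phi}_1 \,|\, \boldsymbol{\Phi}_2]$, $\boldsymbol{\Phi}_1 \in \mathbb{C}^{m \times p}$ and $\boldsymbol{\Phi}_2 \in \mathbb{C}^{m \times (n-p)}$. Now, write $\mathbf{V}$ as $\mathbf{V} = \mathbf{Y} \mathbf{Y}^H$, 
with $\mathbf{Y} = \boldsymbol{\Phi}_1^H \mathbf{Z}^{-H/2}$ and $\mathbf{Z} = \boldsymbol{\Phi} \boldsymbol{\Phi}^H$. Since $\mathbf{Z} = \boldsymbol{\Phi}_1 \boldsymbol{\Phi}_1^H + \boldsymbol{\Phi}_2 \boldsymbol{\Phi}_2^H$ and $\boldsymbol{\Phi}_2 \boldsymbol{\Phi}_2^H$ is distributed as complex 
Wishart $\mathcal{W}_c(\mathbf{I}_m, m; n - p)$ for $n - p > m$, given $\boldsymbol{\Phi}_1$ the pdf of $\mathbf{Z}$ is 

\[
f(\mathbf{Z} \,|\, \boldsymbol{\Phi}_1) = c_1 e^{-\mathrm{tr}(\mathbf{Z} - \boldsymbol{\Phi}_1 \boldsymbol{\Phi}_1^H)} |\mathbf{Z} - \boldsymbol{\Phi}_1 \boldsymbol{\Phi}_1^H|^{n-m-p}. \tag{12}
\]

The pdf of $\boldsymbol{\Phi}_1$ can be written as $c_2 e^{-\mathrm{tr}(\boldsymbol{\Phi}_1 \boldsymbol{\Phi}_1^H)}$. Therefore, the joint pdf of $\mathbf{Z}$ and $\boldsymbol{\Phi}_1$ is 

\[
f(\mathbf{Z}, \boldsymbol{\Phi}_1) = c_3 e^{-\mathrm{tr}(\mathbf{Z})} |\mathbf{Z} - \boldsymbol{\Phi}_1 \boldsymbol{\Phi}_1^H|^{n-m-p}, \tag{13}
\]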
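where $c_1$, $c_2$, and $c_3 = c_1 c_2$ are normalization factors. Since $\mathbf{Y} = \boldsymbol{\Phi}_1^H \mathbf{Z}^{-H/2}$, from (13) the joint pdf of 
$\mathbf{Y}$ and $\mathbf{Z}$ is 

\[
\begin{aligned}
f(\mathbf{Y}, \mathbf{Z}) &= c_3 e^{-\mathrm{tr}(\mathbf{Z})} |\mathbf{Z} - \mathbf{Z}^{1/2} \mathbf{Y}^H \mathbf{Y} \mathbf{Z}^{H/2}|^{n-m-p} |\mathbf{Z}|^{p} \\
&= c_3 e^{-\mathrm{tr}(\mathbf{Z})} |\mathbf{I}_m - \mathbf{Y}^H \mathbf{Y}|^{n-m-p} |\mathbf{Z}|^{n-m}.
\end{aligned} \tag{14}
\]

This shows that $\mathbf{Y}$ and $\mathbf{Z}$ are independent and the pdf of $\mathbf{Y}$ is 

\[
f(\mathbf{Y}) = c_4 |\mathbf{I}_m - \mathbf{Y}^H \mathbf{Y}|^{n-m-p} = c_4 |\mathbf{I}_p - \mathbf{Y} \mathbf{Y}^H|^{n-m-p}, \tag{15}
\]

where $c_4$ is a normalization factor. 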


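Let $g(\mathbf{Y} \mathbf{Y}^H) = f(\mathbf{Y})$. To derive the distribution of $\mathbf{V} = \mathbf{Y} \mathbf{Y}^H$ we use the following theorem. 

Theorem 1: ifTSl If the density of $\mathbf{Y} \in \mathbb{C}^{p \times m}$ is $g(\mathbf{Y} \mathbf{Y}^H)$, then the density of $\mathbf{V} = \mathbf{Y} \mathbf{Y}^H$ is 

\[
f(\mathbf{V}) = \frac{\pi^{mp}}{\Gamma_p(m)} |\mathbf{V}|^{m-p} g(\mathbf{V}), \tag{16}
\]

where $\Gamma_p(\cdot)$ is the complex multivariate Gamma function. 

Using Theorem 1 and (15), the pdf of $\mathbf{V}$ is 

\[
f(\mathbf{V}) = c_5 |\mathbf{V}|^{m-p} |\mathbf{I}_p - \mathbf{V}|^{n-m-p} \quad \text{for } \mathbf{0} \prec \mathbf{V} \prec \mathbf{I}_p, \tag{17}
\]

which is the Type I complex multivariate beta distribution $\mathcal{CB}_p(m, n - m)$ for $c_5 = \frac{\Gamma_p(n)}{\Gamma_p(m) \Gamma_p(n-m)}$. Recall 
that the distribution of the normalized Fisher information matrix after compression $\mathbf{W} = \mathbf{J}^{-1/2} \hat{\mathbf{J}} \mathbf{J}^{-H/2}$ 
is identical to that of $\mathbf{V}$. Therefore, $\mathbf{W}$ is also distributed as $\mathcal{CB}_p(m, n - m)$. 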

Remark 2: It is important to note that the distribution of $\mathbf{W} = \mathbf{J}^{-1/2} \hat{\mathbf{J}} \mathbf{J}^{-H/2}$ is invariant to $\mathbf{J}$, and it 
depends only on the parameters $(n - m) - p$ and $m - p$. In this sense, this result for the distribution 
of $\mathbf{J}^{-1/2} \hat{\mathbf{J}} \mathbf{J}^{-H/2}$ is universal, and reminiscent of the classical result of Reed, Mallett, and Brennan |[T^ 
for normalized SNR in adaptive filtering. 
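The invariance described in Remark 2 can be probed with a small Monte Carlo experiment. The sketch below averages the normalized Fisher information matrix over random Gaussian compressors for two unrelated sensitivity matrices and compares both averages with $(m/n)\mathbf{I}_p$, the mean of a $\mathcal{CB}_p(m, n-m)$ matrix. It assumes the QR factor of $\mathbf{G}$ as the square-root convention for $\mathbf{H}$; the dimensions, seeds, and function names are illustrative.

```python
import numpy as np

def complex_gaussian(rows, cols, rng):
    """Matrix with i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def mean_normalized_fim(G, m, trials, rng):
    """Monte Carlo average of W = H^H P H over random compressors."""
    n, p = G.shape
    H, _ = np.linalg.qr(G)                 # orthonormal basis of the column span of G
    total = np.zeros((p, p), dtype=complex)
    for _ in range(trials):
        Phi = complex_gaussian(m, n, rng)
        B, _ = np.linalg.qr(Phi.conj().T)  # orthonormal basis of the row span of Phi
        BH = B.conj().T @ H
        total += BH.conj().T @ BH          # equals H^H (B B^H) H
    return total / trials

rng = np.random.default_rng(2)
n, m, p, trials = 16, 8, 2, 2000
G_a = complex_gaussian(n, p, rng)
# A second, unrelated G with strongly correlated and rescaled columns.
G_b = 100.0 * complex_gaussian(n, p, rng) @ np.array([[1.0, 0.99], [0.0, 0.1]])
mean_W_a = mean_normalized_fim(G_a, m, trials, rng)
mean_W_b = mean_normalized_fim(G_b, m, trials, rng)
print(np.round(mean_W_a.real, 3))
print(np.round(mean_W_b.real, 3))
```

Both averages should sit close to $0.5\,\mathbf{I}_2$ here, even though the two Fisher information matrices before compression differ by orders of magnitude in scale and conditioning.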

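Lemma 1: ifTTl Assume $\mathbf{A} \in \mathbb{C}^{p \times p}$ is a positive definite Hermitian random matrix with a pdf $h(\mathbf{A})$. 
Then, the joint pdf of eigenvalues $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$ of $\mathbf{A}$ is 

\[
\frac{\pi^{p(p-1)}}{\Gamma_p(p)} \prod_{i<j} (\lambda_i - \lambda_j)^2 \int_{O(p)} h(\mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^H) \, d\mathbf{U}, \tag{18}
\]

where $d\mathbf{U}$ is the invariant Haar measure on the unitary group $O(p)$. 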
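Using Lemma 1, we can derive the joint distribution of the eigenvalues of $\mathbf{W}$. Replacing the pdf of 
$\mathbf{W} \sim \mathcal{CB}_p(m, n - m)$ in (18), the joint pdf of the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_p$ of $\mathbf{W} = \mathbf{J}^{-1/2} \hat{\mathbf{J}} \mathbf{J}^{-H/2}$ is 
given by 

\[
\frac{\pi^{p(p-1)} \, \Gamma_p(n)}{\Gamma_p(p) \Gamma_p(m) \Gamma_p(n-m)} \prod_{i=1}^{p} \lambda_i^{m-p} (1 - \lambda_i)^{n-m-p} \prod_{i<j} (\lambda_i - \lambda_j)^2. \tag{19}
\]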


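Now, from (17) and using the transformation $\hat{\mathbf{J}} = \mathbf{J}^{1/2} \mathbf{W} \mathbf{J}^{H/2}$, the Fisher information matrix after 
compression $\hat{\mathbf{J}}$ is distributed as 

\[
c_5 |\mathbf{J}|^{p-n} |\hat{\mathbf{J}}|^{m-p} |\mathbf{J} - \hat{\mathbf{J}}|^{n-m-p} \quad \text{for } \mathbf{0} \prec \hat{\mathbf{J}} \prec \mathbf{J}, \tag{20}
\]

and the inverse of the Fisher information matrix after compression $\mathbf{K} = \hat{\mathbf{J}}^{-1}$ is distributed as 

\[
c_5 |\mathbf{J}|^{p-n} |\mathbf{K}|^{-n} |\mathbf{J} \mathbf{K} - \mathbf{I}_p|^{n-m-p} \quad \text{for } \mathbf{K} \succ \mathbf{J}^{-1}. \tag{21}
\]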


Remark 3: For the class of random compression matrices that have density functions of the form 
$g(\boldsymbol{\Phi} \boldsymbol{\Phi}^H)$, that is, the distribution of $\boldsymbol{\Phi}$ is right orthogonally invariant, $\boldsymbol{\Phi}^H (\boldsymbol{\Phi} \boldsymbol{\Phi}^H)^{-1/2}$ is uniformly 
distributed on the Stiefel manifold $V_m(\mathbb{C}^n)$ ifTSl . Therefore, the distribution of the normalized Fisher 
information matrix for this class of compression matrices is the same as the one given in (17). 

Remark 4: Using the properties of a complex multivariate beta distribution llT9ll , we have: 

\[
E[\hat{\mathbf{J}}] = \frac{m}{n} \mathbf{J}, \tag{22}
\]

and 

\[
E[\hat{\mathbf{J}}^{-1}] = \frac{n-p}{m-p} \mathbf{J}^{-1}. \tag{23}
\]

This shows that, on average, compression results in a factor $\frac{n}{m}$ loss in the Fisher information and a 
factor $\frac{n-p}{m-p}$ increase in the CRB. 
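The two averages in Remark 4 can be checked empirically. The following sketch, with $\sigma^2 = 1$ and a random stand-in `G` (both assumptions made only for the illustration), estimates the mean of the Fisher information matrix after compression and of its inverse over Gaussian compressors, and compares them with $(m/n)\mathbf{J}$ and $\frac{n-p}{m-p}\mathbf{J}^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, p, trials = 16, 8, 2, 4000  # hypothetical dimensions and trial count
G = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))
J = G.conj().T @ G                         # Fisher information with sigma^2 = 1

sum_J_hat = np.zeros((p, p), dtype=complex)
sum_K = np.zeros((p, p), dtype=complex)
for _ in range(trials):
    Phi = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
    B, _ = np.linalg.qr(Phi.conj().T)      # orthonormal basis of the row span of Phi
    G_hat = B.conj().T @ G                 # coordinates of the projected columns
    J_hat = G_hat.conj().T @ G_hat         # equals G^H P G
    sum_J_hat += J_hat
    sum_K += np.linalg.inv(J_hat)

mean_J_hat = sum_J_hat / trials
mean_K = sum_K / trials
target_fim = (m / n) * J
target_crb = (n - p) / (m - p) * np.linalg.inv(J)
rel_err_fim = np.linalg.norm(mean_J_hat - target_fim) / np.linalg.norm(target_fim)
rel_err_crb = np.linalg.norm(mean_K - target_crb) / np.linalg.norm(target_crb)
print(rel_err_fim, rel_err_crb)  # both should be small
```

Note that the average inflation of the CRB, $(n-p)/(m-p)$, is larger than the inverse of the average Fisher information retention, $n/m$, which reflects the convexity of matrix inversion.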

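The distribution of the CRB after compression can be derived using the following Lemma. 

Lemma 2: llT9ll Assume $\mathbf{X} \sim \mathcal{CB}_p(a_1, a_2)$. Let $\mathbf{z}$ be a complex vector independent of $\mathbf{X}$. Then, $x = \frac{\mathbf{z}^H \mathbf{z}}{\mathbf{z}^H \mathbf{X}^{-1} \mathbf{z}}$ 
is distributed as $\mathcal{B}(a_1 - p + 1, a_2)$, where $\mathcal{B}(\alpha_1, \alpha_2)$ denotes a Type I univariate beta distribution with the pdf 

\[
\frac{\Gamma(\alpha_1 + \alpha_2)}{\Gamma(\alpha_1) \Gamma(\alpha_2)} x^{\alpha_1 - 1} (1 - x)^{\alpha_2 - 1}, \quad 0 < x < 1. \tag{24}
\]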

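Now consider the CRB on an unbiased estimator of parameter $\theta_i$, after compression, normalized by the 
CRB before compression: 

\[
\frac{(\hat{\mathbf{J}}^{-1})_{ii}}{(\mathbf{J}^{-1})_{ii}} = \frac{\mathbf{e}_i^T \hat{\mathbf{J}}^{-1} \mathbf{e}_i}{\mathbf{e}_i^T \mathbf{J}^{-1} \mathbf{e}_i} = \frac{\mathbf{z}^H \mathbf{W}^{-1} \mathbf{z}}{\mathbf{z}^H \mathbf{z}}, \tag{25}
\]

where $\mathbf{z} = \mathbf{J}^{-1/2} \mathbf{e}_i$ and $\mathbf{e}_i \in \mathbb{R}^p$ is a standard unit vector with 1 as its $i$th element and zeros as its other 
elements. By Lemma 2, the above ratio is distributed as the inverse of a univariate beta random variable 
$\mathcal{B}(m - p + 1, n - m)$. 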


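Remark 5: From the distribution of the CRB after compression, we have: 

\[
E[(\hat{\mathbf{J}}^{-1})_{ii}] = \frac{n-p}{m-p} (\mathbf{J}^{-1})_{ii}, \tag{26}
\]

\[
\mathrm{var}[(\hat{\mathbf{J}}^{-1})_{ii}] = \frac{(n-m)(n-p)}{(m-p-1)(m-p)^2} \left( (\mathbf{J}^{-1})_{ii} \right)^2. \tag{27}
\]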
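The moments in Remark 5 follow from the mean and variance of the inverse of a $\mathcal{B}(m - p + 1, n - m)$ variable, namely $\frac{n-p}{m-p}$ and $\frac{(n-m)(n-p)}{(m-p-1)(m-p)^2}$ once the CRB is normalized by its value before compression. The sketch below compares these closed forms with sample moments of inverted beta draws from the Python standard library. The function name and the choice $n = 128$, $m = 64$, $p = 2$ (taken from the numerical example) are illustrative.

```python
import random

def normalized_crb_moments(n, m, p):
    """Mean and variance of the CRB after compression divided by the CRB before."""
    mean = (n - p) / (m - p)
    var = (n - m) * (n - p) / ((m - p - 1) * (m - p) ** 2)
    return mean, var

random.seed(0)
n, m, p = 128, 64, 2
# Inverse of a univariate beta B(m - p + 1, n - m) random variable.
draws = [1.0 / random.betavariate(m - p + 1, n - m) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)

mean, var = normalized_crb_moments(n, m, p)
print(mean, sample_mean)
print(var, sample_var)
```

The variance is finite only when $m - p > 1$, which is why the factor $(m - p - 1)$ appears in the denominator.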

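Remark 6: We can also look at the effect of compressed sensing on the Kullback-Leibler (KL) divergence 
of two normal probability distributions, for the class of random compressors already discussed in Remark 
3. The KL divergence between $\mathcal{CN}(\mathbf{x}(\boldsymbol{\theta}), \mathbf{C})$ and $\mathcal{CN}(\mathbf{x}(\boldsymbol{\theta}'), \mathbf{C})$ is: 

\[
D(\boldsymbol{\theta}, \boldsymbol{\theta}') = (\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}'))^H \mathbf{C}^{-1} (\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}')). \tag{28}
\]

After compression with $\boldsymbol{\Phi}$ we have 

\[
\hat{D}(\boldsymbol{\theta}, \boldsymbol{\theta}') = (\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}'))^H \boldsymbol{\Phi}^H (\boldsymbol{\Phi} \mathbf{C} \boldsymbol{\Phi}^H)^{-1} \boldsymbol{\Phi} (\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}')). \tag{29}
\]

For the case $\mathbf{C} = \sigma^2 \mathbf{I}$, the normalized KL divergence is 

\[
\frac{\hat{D}(\boldsymbol{\theta}, \boldsymbol{\theta}')}{D(\boldsymbol{\theta}, \boldsymbol{\theta}')} = \frac{(\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}'))^H \mathbf{P}_{\boldsymbol{\Phi}^H} (\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}'))}{(\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}'))^H (\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}'))}. \tag{30}
\]

Therefore, the normalized KL divergence, for random compression matrices $\boldsymbol{\Phi}$ whose distributions are 
invariant to right-orthogonal transformations, is distributed as $\mathcal{B}(m, n - m)$. 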
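A short simulation can illustrate this claim. The sketch below fixes an arbitrary difference vector, standing in for $\mathbf{x}(\boldsymbol{\theta}) - \mathbf{x}(\boldsymbol{\theta}')$, projects it onto the row span of random Gaussian compressors, and compares the sample mean and variance of the normalized KL divergence with those of a $\mathcal{B}(m, n - m)$ variable. The dimensions, the seed, and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, trials = 16, 8, 3000
# Stand-in for the mean difference x(theta) - x(theta'); any fixed nonzero vector works.
delta = rng.standard_normal(n) + 1j * rng.standard_normal(n)

ratios = np.empty(trials)
for t in range(trials):
    # The scale of Phi is irrelevant: only its row span enters the projection.
    Phi = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
    B, _ = np.linalg.qr(Phi.conj().T)      # orthonormal basis of the row span of Phi
    ratios[t] = np.linalg.norm(B.conj().T @ delta) ** 2 / np.linalg.norm(delta) ** 2

beta_mean = m / n                               # mean of B(m, n - m)
beta_var = m * (n - m) / (n ** 2 * (n + 1))     # variance of B(m, n - m)
print(ratios.mean(), beta_mean)
print(ratios.var(), beta_var)
```

The result does not depend on the particular difference vector, which mirrors the invariance of the normalized Fisher information matrix to the Fisher information before compression.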


IV. Numerical results 

As a special example, we consider the effect of compression on DOA estimation using a uniform line 
array with $n$ elements. In our simulations, we consider two sources whose electrical angles $\theta_1$ and $\theta_2$ 
are unknown. The mean vector $\mathbf{x}(\boldsymbol{\theta})$ is $\mathbf{x}(\boldsymbol{\theta}) = \mathbf{x}(\theta_1) + \mathbf{x}(\theta_2)$, where 

\[
\mathbf{x}(\theta_i) = A_i e^{j \phi_i} [1, e^{j \theta_i}, e^{j 2 \theta_i}, \ldots, e^{j (n-1) \theta_i}]^T. \tag{31}
\]

Here $A_i$ and $\phi_i$ are the amplitude and phase of the $i$th source, which we assume known. We set $\phi_1 = 
\phi_2 = 0$ and $A_1 = A_2 = 1$. We wish to estimate $\theta_1$, whose true value in this example is zero, in the 
presence of the interfering source at electrical angle $\theta_2 = \pi/n$ (half the Rayleigh limit of the array). For 
our simulations, we use Gaussian compression matrices $\boldsymbol{\Phi}_{m \times n}$ whose elements are i.i.d. $\mathcal{CN}(0, 1/m)$. 
The Fisher information matrix and the CRB on the estimation of $\theta_1$ are calculated for different realizations 
of $\boldsymbol{\Phi}$. Fig. 1 shows the CRB on the estimation of $\theta_1$ before compression divided by its corresponding 
value after compression, i.e. $(\mathbf{J}^{-1})_{11} / (\hat{\mathbf{J}}^{-1})_{11}$, for $m = 64$, $n = 128$. A histogram of actual values of 
$(\mathbf{J}^{-1})_{11} / (\hat{\mathbf{J}}^{-1})_{11}$ for 10^ independent realizations of random $\boldsymbol{\Phi}$ is shown in blue. The red curve represents the pdf of a 
$\mathcal{B}(m - p + 1, n - m)$ distributed random variable for $p = 2$. This figure simply provides an illustration 
of the result (25). 

Fig. 1. Histogram data and analytical distributions for $(\mathbf{J}^{-1})_{11} / (\hat{\mathbf{J}}^{-1})_{11}$ using 10® realizations of i.i.d. Gaussian compression matrices 
with $n = 128$ and $m = 64$. 
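The experiment behind Fig. 1 can be sketched in a few lines. The code below assumes the standard uniform line array model, in which the $k$th element of the steering vector is $A_i e^{j\phi_i} e^{jk\theta_i}$ for $k = 0, \ldots, n-1$, so that the sensitivity vector is its derivative with respect to $\theta_i$; this form, the number of trials, and all function names are assumptions made for the illustration. It draws compressors with i.i.d. $\mathcal{CN}(0, 1/m)$ entries and compares the sample mean of the CRB ratio with the mean of a $\mathcal{B}(m - p + 1, n - m)$ variable.

```python
import numpy as np

def ula_sensitivity(n, thetas, amps, phases):
    """Columns g_i = d x(theta_i) / d theta_i for the assumed line-array model."""
    k = np.arange(n)[:, None]
    thetas, amps, phases = map(np.asarray, (thetas, amps, phases))
    return 1j * k * amps * np.exp(1j * phases) * np.exp(1j * k * thetas)

def crb_ratio_samples(G, m, trials, rng):
    """Samples of (CRB before compression) / (CRB after compression) for theta_1."""
    n, _ = G.shape
    crb_before = np.real(np.linalg.inv(G.conj().T @ G)[0, 0])
    ratios = np.empty(trials)
    for t in range(trials):
        Phi = (rng.standard_normal((m, n))
               + 1j * rng.standard_normal((m, n))) / np.sqrt(2 * m)  # CN(0, 1/m)
        B, _ = np.linalg.qr(Phi.conj().T)   # orthonormal basis of the row span of Phi
        G_hat = B.conj().T @ G
        crb_after = np.real(np.linalg.inv(G_hat.conj().T @ G_hat)[0, 0])
        ratios[t] = crb_before / crb_after
    return ratios

rng = np.random.default_rng(7)
n, m, p = 128, 64, 2
G = ula_sensitivity(n, thetas=[0.0, np.pi / n], amps=[1.0, 1.0], phases=[0.0, 0.0])
ratios = crb_ratio_samples(G, m, trials=400, rng=rng)
beta_mean = (m - p + 1) / (n - p + 1)       # mean of B(m - p + 1, n - m)
print(ratios.mean(), beta_mean)
```

The noise variance cancels in the ratio, so it is omitted. A histogram of `ratios` could be overlaid with the beta pdf to obtain a plot in the spirit of Fig. 1.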


Recall that the inverse Fisher information matrix $\mathbf{J}^{-1}$ lower bounds the error covariance matrix $\boldsymbol{\Sigma} = 
E[\mathbf{e} \mathbf{e}^H]$ for unbiased errors $\mathbf{e} = \hat{\boldsymbol{\theta}} - \boldsymbol{\theta}$. So the concentration ellipse satisfies $\mathbf{e}^H \boldsymbol{\Sigma}^{-1} \mathbf{e} \leq \mathbf{e}^H \mathbf{J} \mathbf{e}$ for all $\mathbf{e} \in \mathbb{C}^p$. 

The ellipses $\mathbf{e}^H \mathbf{J} \mathbf{e} = r^2$ and $\mathbf{e}^H \hat{\mathbf{J}} \mathbf{e} = r^2$, with $r^2 = J_{11}$, are illustrated in Fig. 2, demonstrating the effect 
that compression inflates the concentration ellipse. The blue curve is the locus of all points $\mathbf{e} \in \mathbb{C}^p$ for 
which $\mathbf{e}^H \mathbf{J} \mathbf{e} = r^2$. The red curves are the loci of all points $\mathbf{e} \in \mathbb{C}^p$ for which $\mathbf{e}^H \hat{\mathbf{J}} \mathbf{e} = r^2$, for 100 
realizations of the Fisher information matrix after compression. As can be seen, the concentration ellipse 
for the Fisher information matrix before compression has the smallest volume in comparison with all the 
realizations of the concentration ellipses after compression. Also, for each realization of the Gaussian 
compression, the orientation of the concentration ellipse is nearly aligned with that of the uncompressed 
ellipse. 

Fig. 2. Concentration ellipses for the Fisher information matrices before and after compression. 

Figure 3 shows the compression ratio $m/n$ needed so that the CRB after compression $(\hat{\mathbf{J}}^{-1})_{ii}$ does not 
exceed $k$ times the CRB before compression $(\mathbf{J}^{-1})_{ii}$, at two levels of confidence and for $n = 128$. These 
curves are plotted using the tail probabilities of a univariate beta random variable. They can be used as 
guidelines for deriving a satisfactory compression ratio based on a tolerable level of loss in the CRB. 
Alternatively, we can plot the confidence level curves versus $m$ for fixed values of $k$. In that case, the 
plots may be useful to find a number of measurements that would guarantee that after compression the CRB 
does not go above a desired bound (corresponding to a particular $k$) with a certain level of confidence. 

Fig. 3. Compression ratios needed so that $(\hat{\mathbf{J}}^{-1})_{ii} < k (\mathbf{J}^{-1})_{ii}$ for different confidence levels. 
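The guideline described above can be turned into a small calculation. Since the CRB after compression divided by the CRB before compression is the inverse of a $\mathcal{B}(m - p + 1, n - m)$ variable, the probability that the CRB grows by at most a factor $k$ is the probability that the beta variable is at least $1/k$. The sketch below evaluates this tail exactly for integer parameters through the standard relation between the beta and binomial distributions, then scans for the smallest $m$ meeting a target confidence. The function names and the example values of $k$ and confidence are assumptions for illustration, not the settings used for Fig. 3.

```python
from math import comb

def prob_crb_within_factor(n, m, p, k):
    """P[CRB after <= k * CRB before] = P[x >= 1/k] with x ~ B(m - p + 1, n - m)."""
    a, b = m - p + 1, n - m
    N, x = a + b - 1, 1.0 / k
    # For integer a and b, P[x_beta <= x] = P[Binomial(N, x) >= a],
    # so the upper tail of the beta is P[Binomial(N, x) <= a - 1].
    return sum(comb(N, j) * x ** j * (1.0 - x) ** (N - j) for j in range(a))

def min_measurements(n, p, k, confidence):
    """Smallest m (with n - p >= m) meeting the confidence level, or None."""
    for m in range(p + 1, n - p + 1):
        if prob_crb_within_factor(n, m, p, k) >= confidence:
            return m
    return None

n, p = 128, 2
m_k2 = min_measurements(n, p, k=2.0, confidence=0.95)
m_k4 = min_measurements(n, p, k=4.0, confidence=0.95)
print(m_k2, m_k2 / n)   # measurements and compression ratio for k = 2
print(m_k4, m_k4 / n)   # a looser bound k = 4 tolerates fewer measurements
```

The floating-point binomial sum is adequate for array sizes like $n = 128$; much larger $n$ would call for a log-domain or library implementation of the incomplete beta function.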
V. Conclusion 

In this paper, we have studied the effect of random compression of noisy measurements on the CRB for 
estimating parameters in a nonlinear model. We have considered the class of random compression matrices 
whose distributions are right-orthogonally invariant. A random compression matrix with i.i.d. standard 
normal elements is one such compression matrix. The analytical distrihutions obtained in this paper can 
be used to quantify the amount of loss due to compression. Also, they can be used as guidelines for 
choosing a suitable compression ratio based on a tolerable loss in the CRB. Importantly, the distribution 
of the ratios of CRBs before and after compression depends only on the number of parameters and the 
number of measurements. The distribution is invariant to the underlying signal-plus-noise model, in the 
sense that it is invariant to the underlying (before compression) Fisher information matrix. 